39 research outputs found

    Multi-Dimensional Inheritance

    In this paper, we present an alternative approach to multiple inheritance for typed feature structures. In our approach, a feature structure can be associated with several types coming from different hierarchies (dimensions). In the case of multiple inheritance, a type has supertypes from different hierarchies. We contrast this approach with approaches based on a single type hierarchy, where a feature structure has exactly one most general type and multiple inheritance involves computing greatest lower bounds in the hierarchy. The proposed approach supports current linguistic analyses in constraint-based formalisms like HPSG, inheritance in the lexicon, and knowledge representation for NLP systems. Finally, we show that multi-dimensional inheritance hierarchies can be compiled into a Prolog term representation, which allows the conjunction of two types to be computed efficiently by Prolog term unification. Comment: 9 pages; styles: a4, figfont, eepic, eps
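The idea of computing a type conjunction by unification can be illustrated with a small Python sketch. This is not the paper's Prolog encoding. It assumes that each dimension is a tree-shaped hierarchy, that a type is written as one root-to-node path per dimension, and that a shorter path plays the role of a Prolog term with an unbound tail. The names `unify_paths` and `conjoin` and the example dimensions (part of speech, valence) are hypothetical.

```python
from typing import Optional, Tuple

Path = Tuple[str, ...]         # path from a dimension's root to a type (assumed encoding)
MultiType = Tuple[Path, ...]   # one path per dimension


def unify_paths(a: Path, b: Path) -> Optional[Path]:
    """Unify two types within one tree-shaped dimension.

    Succeeds only if one path is a prefix of the other; the result is the
    longer (more specific) path. Returns None on a clash.
    """
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    return longer if longer[:len(shorter)] == shorter else None


def conjoin(s: MultiType, t: MultiType) -> Optional[MultiType]:
    """Conjoin two multi-dimensional types dimension by dimension."""
    if len(s) != len(t):
        raise ValueError("both types must cover the same dimensions")
    result = []
    for a, b in zip(s, t):
        unified = unify_paths(a, b)
        if unified is None:
            return None  # incompatible in this dimension
        result.append(unified)
    return tuple(result)


# Hypothetical dimensions: (part of speech, valence).
# An empty path means "unspecified in this dimension".
VERB: MultiType = (("verb",), ())
FINITE_VERB: MultiType = (("verb", "finite"), ())
NOUN: MultiType = (("noun",), ())
TRANSITIVE: MultiType = ((), ("trans",))

transitive_verb = conjoin(VERB, TRANSITIVE)
clash = conjoin(VERB, NOUN)
```

In this sketch, `conjoin(VERB, TRANSITIVE)` combines information from two dimensions without any lookup of greatest lower bounds, while `conjoin(VERB, NOUN)` fails because the two paths diverge within the same dimension.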

    Corpora and evaluation tools for multilingual named entity grammar development

    We present an effort to develop multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, and time and date expressions. Subgrammars and gazetteers are shared as much as possible among the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.

    Data-centric view in e-Science information systems

    Network approaches in Current Research Information Systems support the shift from a document-centric to a data-centric view, which acknowledges the primacy of data in the scientific process. E-science holds the promise of a complete, data-centred documentation of the scientific process.

    Bottom-up Earley deduction

    We propose a bottom-up variant of Earley deduction. Bottom-up deduction is preferable to top-down deduction because it allows incremental processing (even for head-driven grammars), it is data-driven, no subsumption check is needed, and preference values attached to lexical items can be used to guide best-first search. We discuss the scanning step for bottom-up Earley deduction and indexing schemes that help avoid useless deduction steps.

    Why Natural Language Processing Needs Oz

    this paper is to survey the requirements that natural language processing (NLP) has on a programming language, evaluate to what extent they are satisfied by various logic programming languages, and in particular by the Oz language. It turns out that Oz appears to be a promising candidate for NLP implementations.

    Keywords: Natural Language Processing, Grammar Formalisms, Oz

    1. Introduction

    For a long time, NLP has been in a love-hate relationship with logic programming in general, and with Prolog in particular. Let's consider the positive sides first:

    • Prolog is a declarative language
    • Prolog provides useful data structures such as trees (terms), lists, and logical variables
    • Unification comes for free
    • Search comes for free

    It's nice to remember the enjoyment of writing one's first DCG and having it parse and generate sentences after just a few minutes of development time. And indeed many useful natural